When studying the information leakage in programs or protocols, a natural question arises: "what is 
the worst case scenario?". This problem of identifying the maximal leakage can be seen as a channel 
capacity problem in the information theoretical sense. In this paper, by combining two powerful 
theories: Information Theory and Karush-Kuhn-Tucker conditions, we demonstrate a very general 
solution to the channel capacity problem. Examples are given to show how our solution can be 
applied to practical contexts of programs and anonymity protocols, and how this solution generalizes 
previous approaches to this problem. 

1 Introduction 

As emphasized in the existing literature, no electronic system can guarantee perfect confidentiality or 
anonymity [19]. Hence, measuring the leakage of confidential information is a pressing but increasingly 
challenging issue. The ability to preemptively assess possible information leaks is crucial for designing 
and understanding a system that contains information which ought to be protected CD. 

Information Theory [25] provides a general method for measuring information flow in information 
channels, and extends to quantify the loss of confidentiality and anonymity. A number of previous works 
have addressed and measured the channel capacity of information leakage channels, which describes the 
worst-case leakage. Recently a novel technique to measure the channel capacity of anonymity protocols 
and programs using Lagrange multipliers has been proposed in [21, 7]: this setting is able to answer 
questions like: "what is the maximum leakage of a system where a random string is 1000 times less 
likely to be the secret than a dictionary word", i.e. under an equality constraint like $1000\,p_{rand} = p_{word}$.¹ 

In order to analyze a much wider range of systems and scenarios, inequality constraints ought to be 
supported. An example of such a constraint is: "the password is over 1000 times more likely to be a word 
from a dictionary than a meaningless string", i.e. $1000\,p_{rand} < p_{word}$: such inequality constraints cannot 
be handled using Lagrange multipliers alone. Therefore, we introduce Karush-Kuhn-Tucker (KKT) conditions to enable 
inequality constraints for deriving the channel capacity, and present a set of theorems and propositions 
which can be readily applied. This makes the approach more powerful and enables it to deal with a much 
wider spectrum of cases, as demonstrated later on in this paper. Further, we believe that this approach, 
orthogonal to the probabilistic methods which have dominated protocol security analysis [12, 11, 24], 
will provide novel and more practical results to the research community. 

The paper is organized as follows: the next subsection discusses existing literature and the 
background is introduced in Section 2. In Section 3 we describe the theorems and propositions for 
channel capacity using Karush-Kuhn-Tucker conditions with full proofs. We show that our method can 

¹ By maximum leakage we mean the maximum number of bits leaked. Notice that this is different from the maximum 
percentage of the secret leaked. 



M. Boreale, S. Kremer (Eds.): 

Security Issues in Concurrency (SecCo'09) 

EPTCS 7, 2009, pp. 1-15, doi:10.4204/EPTCS.7.1 



© Han Chen & Pasquale Malacaria 
This work is licensed under the 
Creative Commons Attribution License. 






Studying Maximum Information Leakage Using Karush-Kuhn-Tucker Conditions 



be applied to programs and protocols in Section 4. Finally, we provide concluding remarks and discuss 
future work in Section 5. 

1.1 Related Works 

This work extends previous works by Chen and Malacaria [21, 7]. Information leakage is measured 
using the same information theoretical definitions used by a number of authors [8, 19, 3, 15, 16], and 
follows pioneering works by Denning [10], Gray [12], McLean [18] and Millen [22]. A recent alternative 
information theoretical definition of leakage has been proposed by Smith [26] in terms of min entropy; 
in the context of protocols those ideas have been investigated in [4]. A discussion of the relation between 
min entropy and Shannon entropy relevant to the context of this work can be found in [20]. In a program 
analysis context channel capacity has been recently investigated in [23]. 

Channel capacity of anonymity protocols in a restricted context has been characterized by 
Chatzikokolakis, Palamidessi and Panangaden [6]. However, their method applies only to protocols with "symmetric" 
properties. These restrictions are overcome in [21, 7], where Lagrange multipliers are used to compute 
the maximum leakage of deterministic programs and anonymity protocols with additional equality 
constraints. Blahut [2] mentioned KKT conditions while proposing his iterative algorithm for approximating 
channel capacity. However, the use of KKT conditions in the context of this work is original. 

There is a large body of work on anonymity protocols using probabilistic techniques [12, 11, 24]. 
A probabilistic approach would assume a certain kind of distribution to work out an expectation of 
anonymity in a given model. In comparison, our method allows for the use of more flexible relationships 
to track the maximum leakage, which is a pressing problem that remains largely unsolved. Whilst it is 
known that information theoretical and probabilistic notions are related, the extent of this relationship 
requires further investigation. 

2 Background 

In this work we refer to a program or a protocol as an information leakage channel. We define an 
information leakage channel as a triple 

$$(\mathcal{H}, \mathcal{O}, \phi)$$

where the input, $\mathcal{H}$, is a set of confidential information, and the output, $\mathcal{O}$, is a set of observations. 
To introduce probabilities we use two random variables: $h$ for $\mathcal{H}$ and $O$ for $\mathcal{O}$ respectively. We also 
denote members of $\mathcal{H}$ as $h_i \in \mathcal{H}$, and members of $\mathcal{O}$ as $o_j \in \mathcal{O}$. $\phi$ describes the conditional probability 
between the two random variables: 

$$\phi = P(O \mid h)$$

In deterministic channels, one input $h_i$ can only produce one output $o_j$, thus $\phi_{j,i} = 1$. 

With this definition, both programs and anonymity protocols can be seen as information leakage 
channels. In general, an information leakage channel has three elements: confidential information as inputs, 
public information as observations, and the conditional probabilities relating them. 

The triple $(\mathcal{H}, \mathcal{O}, \phi)$ can be represented by a probability matrix: rows describe elements of 
$\mathcal{H}$, columns describe elements of $\mathcal{O}$, and the value at position $(h_i, o_j)$ is the conditional probability $\phi_{j,i}$. 
This is the chance of observing $o_j$ given $h_i$ as the input. 
2.1 Background on Karush-Kuhn-Tucker Conditions 

Karush-Kuhn-Tucker (KKT) conditions [17] generalize the Lagrange method for finding the extrema 
of a function subject to a family of constraints: while Lagrange multipliers consider only equality 
constraints, KKT conditions allow for general inequality constraints. We refer the reader to [21, 7] for a 
background on Lagrange multipliers in this context. 

2.2 A Simple Example 

We illustrate the method with a simple example. 
Suppose we want to maximize the function 

$$f(x,y) = xy$$

subject to the inequality constraint 

$$x + y \le 8$$

(we implicitly restrict attention to non-negative $x$ and $y$; otherwise $xy$ is unbounded on the feasible set). 
First we construct the Lagrange function, which combines the function to be maximized 
and the constraint: 

$$L(x,y,\lambda) = xy + \lambda(8 - x - y)$$

where $\lambda$ is a number which indicates the weight associated with the constraint; for example, ignoring the 
constraint is equivalent to setting $\lambda = 0$. 

Formally, $\lambda$ is the Lagrange multiplier, and the Lagrange technique finds the maximum of the 
function by differentiating $L$ with respect to $x$, $y$ and $\lambda$. 

Using KKT we get the optimal solution for the original optimization problem by solving the following 
equations: 

$$\frac{\partial L}{\partial x} = y - \lambda = 0, \qquad \frac{\partial L}{\partial y} = x - \lambda = 0, \qquad \lambda(8 - x - y) = 0$$

From the first two equations, together with the original constraint, we deduce 

$$x + y \le 8, \qquad x = y = \lambda$$

We use the conclusion $x = y = \lambda$ to replace $x$ and $y$ in the original function and get the maximal value 

$$\max f(x,y) = \lambda^2$$

Notice there is another equation we have not used so far, $\lambda(8 - x - y) = 0$. From this and the constraint 
$x + y \le 8$ we get two cases: 

$$\lambda = 0 \quad \text{or} \quad x + y = 8$$

The case $\lambda = 0$ gives $x = y = 0$, which is a saddle point of $f$ and therefore not a local maximum. In the 
second case we replace $x$ and $y$ by $\lambda$ and get 

$$2\lambda = 8 \;\Rightarrow\; \lambda = 4$$

It is then easy to derive the values of the other variables, i.e. 

$$x = 4, \qquad y = 4$$

The values $x = 4$, $y = 4$ satisfy the constraint and maximize the original 
function: 

$$\max f(x,y) = xy = 16$$
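The case analysis above can be mirrored in a few lines of code. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the two candidate points are the ones derived by hand above (they are not found by a solver), and non-negative $x$ and $y$ are assumed.

```python
# KKT case analysis for: maximize f(x, y) = x * y subject to x + y <= bound.
# Assumption (not stated in the original example): x and y are non-negative.


def satisfies_kkt(x, y, lam, bound=8.0, tol=1e-12):
    """Check the KKT conditions for L = x*y + lam*(bound - x - y)."""
    stationary = abs(y - lam) <= tol and abs(x - lam) <= tol  # dL/dx = dL/dy = 0
    feasible = x + y <= bound + tol                            # original constraint
    nonnegative = lam >= -tol                                  # multiplier sign
    slack = abs(lam * (bound - x - y)) <= tol                  # complementary slackness
    return stationary and feasible and nonnegative and slack


def kkt_candidates(bound=8.0):
    """Enumerate the two cases of lam * (bound - x - y) = 0."""
    inactive = (0.0, 0.0, 0.0)                    # lam = 0 forces x = y = 0
    active = (bound / 2, bound / 2, bound / 2)    # x + y = bound with x = y = lam
    return [c for c in (inactive, active) if satisfies_kkt(*c, bound=bound)]


def best_candidate(bound=8.0):
    """Return the KKT point (x, y, lam) with the largest objective value."""
    return max(kkt_candidates(bound), key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    x, y, lam = best_candidate()
    print(x, y, lam, x * y)
```

Both candidates pass the KKT check, which is why the objective values still have to be compared: the conditions are necessary, not sufficient.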






2.3 Theoretical Basis of Karush-Kuhn-Tucker Conditions 

We consider the problem of finding the extrema of a function $f$ subject to a family of constraints $(C_i)_{1 \le i \le m}$, 
where $C_i$ is an inequality of the form $g_i(x) \ge b_i$. 

The first step is to construct the Lagrange function $L(x, \lambda)$, where $\lambda$ is a Lagrange multiplier for the 
inequality constraint, similar to the multiplier for an equality constraint. The inequality constraints 
are expressed in the form $g_i(x) - b_i \ge 0$ and then we introduce the $\lambda_i$ associated with the constraints. 

In a general setting let $L(x,\lambda)$ be the Lagrangian of a function $f$ subject to a family of constraints 
$(C_i)_{1 \le i \le m}$ ($C_i \equiv g_i(x) \ge b_i$), i.e. 

$$L(x,\lambda) = f(x) + \sum_{1 \le i \le m} \lambda_i \left(g_i(x) - b_i\right)$$

The basic result justifying the KKT method is the following: 

Theorem 2.1 Assume the vector $x^* = (x_1^*, \ldots, x_n^*)$ maximizes (or minimizes) the continuous function $f(x)$ 
subject to the constraints $(g_i(x) \ge b_i)_{1 \le i \le m}$. Then either 

1. the vectors $(\nabla g_i(x^*))_{1 \le i \le m}$ are linearly dependent, or 

2. there exists a vector $\lambda^* = (\lambda_1^*, \ldots, \lambda_m^*)$ which, together with $x^*$, satisfies the following conditions 

$$\nabla L(\lambda^*, x^*) = 0, \quad \text{i.e.} \quad \left(\frac{\partial L}{\partial x_i}(x^*) = 0\right)_{1 \le i \le n}$$

and 

$$\lambda_i^*\left(g_i(x^*) - b_i\right) = 0, \qquad g_i(x^*) \ge b_i, \qquad \lambda_i \ge 0$$

where $\nabla$ is the gradient and these conditions are called KKT conditions. 

The condition $\lambda_i \ge 0$ requires a non-negative Lagrange multiplier, and $\lambda_i^*(g_i(x^*) - b_i) = 0$ implies two 
cases: 

$$g_i(x^*) = b_i \qquad (1)$$
$$g_i(x^*) > b_i \;\Rightarrow\; \lambda_i = 0 \qquad (2)$$

2.4 Results of Lagrange Multipliers: A Short Review 

We now give a short review of the results in [21, 7]. These works use Lagrange multipliers to compute the 
channel capacity of programs and anonymity protocols with equality constraints. 

Theorem 2.2 In probabilistic channels, the probabilities $h_i$ maximizing $I(h;O)$ subject to the family of 
constraints $(C_k)_{k \in K}$ are given by solving in $h_i$ the equations 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \sum_k \lambda_k f_{i,k} = 0$$

and the constraints $(C_k)_{k \in K}$, 

where $\phi_{s,i}$ is the probability of observing $o_s$ when the input is $h_i$, $O_i$ denotes the set of observations 
possible for the secret $h_i$, and $f_{i,k}$ is the factor of $h_i$ in the $k$-th constraint. 
Using the probabilities $h_i$ we can work out the channel capacity. 

Proposition 2.3 The channel capacity is given by 

$$d \sum_i h_i \left(1 - \sum_k \lambda_k f_{i,k}\right)$$

where $d = \frac{1}{\ln 2}$. 

If the system is deterministic, the formula in Theorem 2.2 can be simplified to 

$$-\ln(o_i) - 1 + \sum_k \lambda_k f_{i,k} = 0$$

Moreover, in the case of the single constraint $\sum_i h_i = 1$ the channel capacity of deterministic 
information leakage channels can be further simplified. 

Proposition 2.4 The channel capacity of deterministic information leakage channels without any 
additional constraint is given by 

$$d(1 - \lambda_0)$$

where $d = \frac{1}{\ln 2}$. 

Together with Theorem 2.2, Proposition 2.4 implies the well known fact that the channel 
capacity of unconstrained deterministic programs is the log of the number of possible outputs. 
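The step from Proposition 2.4 to this well known fact can be spelled out as follows. This is a short worked derivation, not taken from the original text; the symbol $n$ (the number of distinct outputs the program can produce) is introduced here only for illustration.

```latex
\begin{align*}
-\ln(o_i) - 1 + \lambda_0 = 0
  \;&\Longrightarrow\; o_i = e^{\lambda_0 - 1} \quad \text{for the output of every input } h_i, \\
\sum_{j=1}^{n} o_j = 1
  \;&\Longrightarrow\; n\, e^{\lambda_0 - 1} = 1
  \;\Longrightarrow\; \lambda_0 = 1 - \ln n, \\
d\,(1 - \lambda_0) &= \frac{\ln n}{\ln 2} = \log_2 n .
\end{align*}
```

In words: with only the constraint $C_0$, the maximizing input distribution makes all $n$ outputs equally likely, and the leakage is then $\log_2 n$ bits.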



3 Channel Capacity using Karush-Kuhn-Tucker Conditions 

3.1 Constraints 

Often the attacker's knowledge about the secret can be expressed in terms of inequalities: for example, 
"a unix password is 100 times more likely to be a word from a dictionary than a meaningless string". 
We hence need KKT conditions to compute channel capacity in this context. Remember that there is 
always at least one constraint for the input distribution requiring that the sum of the probabilities is 
1; we denote this constraint as $C_0$. Additional constraints specify the conditions that the inputs 
need to satisfy: we use $C_k$ for these conditions. 

$$C_0 \equiv \sum_i h_i = 1$$
$$C_k \equiv g_k(h_i) \ge F_k \quad (k > 0)$$

where $F_k$ are constants and $g_k(h_i)$ are "statistics" or expectations, i.e. linear expressions of 
the form 

$$g_k(h) = \sum_i h_i f_{i,k}$$

KKT conditions only provide precise solutions for non-strict inequalities; for strict inequalities, we can 
only provide an approximate solution. 
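One way such linear constraints might be represented in code is as a coefficient vector $(f_{i,k})_i$ plus a bound $F_k$. The sketch below is only illustrative: the class name, the helper `is_distribution` and the two-secret example $h = (p_{word}, p_{rand})$ are assumptions made here, not part of the original work.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearConstraint:
    """One inequality C_k: sum_i h_i * f_{i,k} >= F_k."""

    coeffs: Sequence[float]  # the factors f_{i,k}, one per secret h_i
    bound: float             # the constant F_k

    def value(self, h: Sequence[float]) -> float:
        """Evaluate the statistic g_k(h) = sum_i h_i * f_{i,k}."""
        return sum(f * p for f, p in zip(self.coeffs, h))

    def holds(self, h: Sequence[float], tol: float = 1e-12) -> bool:
        """Non-strict check g_k(h) >= F_k, up to a small tolerance."""
        return self.value(h) >= self.bound - tol


def is_distribution(h: Sequence[float], tol: float = 1e-9) -> bool:
    """Constraint C_0: non-negative probabilities summing to 1."""
    return all(p >= 0 for p in h) and abs(sum(h) - 1.0) <= tol


# Hypothetical two-secret example: h = (p_word, p_rand) with p_word >= 100 * p_rand,
# written in the form g(h) = 1 * p_word - 100 * p_rand >= 0.
dictionary_bias = LinearConstraint(coeffs=(1.0, -100.0), bound=0.0)
```

Only the non-strict form is encoded, in line with the remark above that strict inequalities can be treated only approximately.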

3.2 Theory and Proof 
Convention: 

As previously explained, we denote $h_i$ as the $i$-th possible value that the variable $h$ can assume. Also, 
$o_j$ denotes the $j$-th possible value for the observation variable $O$. Each possible event $h_i$ has a given 
probability $\mu(h_i)$. To ease the exposition we will use $h_i$ both for the event $h_i$ and for its probability $\mu(h_i)$, 
and similarly $o_j$ for $\mu(o_j)$. However, when it is clear from the context we may use $h_i$ for the $i$-th value of 
the variable $h$, i.e. $h = v_i$. The context will disambiguate what meaning is intended. 

As usual we write $\phi_{k,i}$ for the conditional probability of observing $o_k$ given the input 
$h_i$. Using Information Theory we have: 

$$\begin{aligned}
I(h;O) &= H(h) - H(h \mid O) \\
&= H(h) + \sum_k o_k \sum_i (h_i \mid o_k) \log(h_i \mid o_k) \\
&= H(h) + \sum_{i,k} (h_i, o_k) \log(h_i \mid o_k) \\
&= -\sum_i h_i \log(h_i) + \sum_{i,k} h_i \phi_{k,i} \log\left(\frac{h_i \phi_{k,i}}{o_k}\right) \\
&= -\sum_{i,k} h_i \phi_{k,i} \log(h_i) + \sum_{i,k} h_i \phi_{k,i} \log\left(\frac{h_i \phi_{k,i}}{o_k}\right) \\
&= \sum_{i,k} h_i \phi_{k,i} \log\left(\frac{\phi_{k,i}}{o_k}\right)
\end{aligned}$$
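The equality between the first and the last line of this derivation can be checked numerically. The sketch below is an illustration under assumptions made here: the channel matrix is stored as `phi[i][k]` $= P(o_k \mid h_i)$ (rows are secrets, columns are observations), and the function names and the sample channel are hypothetical.

```python
import math


def output_probs(h, phi):
    """o_k = sum_i h_i * phi[i][k], the probability of each observation."""
    return [sum(h[i] * phi[i][k] for i in range(len(h))) for k in range(len(phi[0]))]


def entropy_bits(dist):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def leakage_closed_form(h, phi):
    """Last line of the derivation: sum_{i,k} h_i phi_{k,i} log2(phi_{k,i} / o_k)."""
    o = output_probs(h, phi)
    return sum(
        h[i] * p * math.log2(p / o[k])
        for i, row in enumerate(phi)
        for k, p in enumerate(row)
        if p > 0 and h[i] > 0
    )


def leakage_from_entropies(h, phi):
    """First line of the derivation: H(h) - H(h | O)."""
    o = output_probs(h, phi)
    conditional = 0.0
    for k, ok in enumerate(o):
        if ok > 0:
            posterior = [h[i] * phi[i][k] / ok for i in range(len(h))]
            conditional += ok * entropy_bits(posterior)
    return entropy_bits(h) - conditional


if __name__ == "__main__":
    h = [0.4, 0.6]
    phi = [[0.7, 0.3], [0.2, 0.8]]
    print(leakage_closed_form(h, phi), leakage_from_entropies(h, phi))
```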



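Notice that, for a fixed input $h_i$ and up to terms that do not depend on $h_i$, 

$$\sum_{i,k} h_i \phi_{k,i} \log\left(\frac{\phi_{k,i}}{o_k}\right) = d\left(\sum_{o_s \in O_i} h_i \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) + \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \ln\left(\frac{\phi_{s,r}}{o_s}\right)\right)$$

where $d = \frac{1}{\ln 2}$ and 

$$P_i = \{h_r \mid \phi_{s,r} \ne 0,\ o_s \in O_i,\ r \ne i\}$$

where in the formula $O_i$ denotes the set of observations possible for the secret $h_i$ (i.e. the set of 
observations with non-zero probability given the input $h_i$). 

Assuming a set of constraints $(C_k)_{k \in K} \equiv g_k(h_i) \ge F_k$, the Lagrange function hence becomes 

$$\begin{aligned}
L(h_i) &= I(h;O) + d \sum_k \lambda_k \left(\sum_i h_i f_{i,k} - F_k\right) \\
&= d \sum_{o_s \in O_i} h_i \phi_{s,i} \ln\frac{\phi_{s,i}}{o_s} + d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \ln\frac{\phi_{s,r}}{o_s} + d \sum_k \lambda_k \left(\sum_i h_i f_{i,k} - F_k\right)
\end{aligned}$$

where $d = \frac{1}{\ln 2}$ is used to convert the base-2 logarithm $\log$ into the natural logarithm $\ln$. 
As mentioned earlier, we always assume the constraint $C_0 \equiv \sum_i h_i = 1$. 
Using KKT the maximum of $L(h_i)$ is given by the following theorem: 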

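Theorem 3.1 In information leakage channels, the probabilities $h_i$ maximizing $I(h;O)$ subject to the 
family of constraints $(C_k)_{k \in K} \equiv g_k(h_i) \ge F_k$ are given by solving in $h_i$ the following system: 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \sum_k \lambda_k f_{i,k} = 0 \;\wedge\; \lambda_k \ge 0,\; g_k(h_i) \ge F_k$$

or 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \lambda_0 = 0 \;\wedge\; g_k(h_i) \ge F_k$$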
o s eOi 
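Proof: Recall that the KKT conditions are 

$$\left(\frac{\partial L}{\partial h_i}(\lambda^*) = 0\right)_{1 \le i \le n}, \qquad \lambda_k\left(\sum_i h_i f_{i,k} - F_k\right) = 0, \qquad \sum_i h_i f_{i,k} \ge F_k, \qquad \lambda_k \ge 0$$

Compared to the conditions for equality constraints, there are three additional ones: 

$$\lambda_k\left(\sum_i h_i f_{i,k} - F_k\right) = 0, \qquad \sum_i h_i f_{i,k} \ge F_k, \qquad \lambda_k \ge 0$$

which replace 

$$\left(\frac{\partial L}{\partial \lambda_k} = 0\right)_{1 \le k \le m}$$

which actually represents the constraints 

$$\sum_i h_i f_{i,k} - F_k = 0$$

Firstly we simplify the three additional constraints as 

$$\lambda_k\left(\sum_i h_i f_{i,k} - F_k\right) = 0,\; \sum_i h_i f_{i,k} \ge F_k,\; \lambda_k \ge 0 \;\Rightarrow\; \left(\sum_i h_i f_{i,k} - F_k = 0 \;\wedge\; \lambda_k \ge 0\right) \text{ or } \left(\sum_i h_i f_{i,k} \ge F_k \;\wedge\; \lambda_k = 0\right)$$

Combining the result with the derivative condition 

$$\left(\frac{\partial L}{\partial h_i}(\lambda^*) = 0\right)_{1 \le i \le n}$$

we obtain the new pair of conditions for maximizing $L$: 

$$\left(\frac{\partial L}{\partial h_i}(\lambda^*) = 0,\; \frac{\partial L}{\partial \lambda_k}(\lambda^*) = 0\right)_{1 \le i \le n} \wedge\; \lambda_k \ge 0 \quad \text{or} \quad \left(\frac{\partial L}{\partial h_i}(\lambda^*) = 0\right)_{1 \le i \le n} \wedge\; \sum_i h_i f_{i,k} \ge F_k \;\wedge\; \lambda_k = 0$$

We first consider the derivative $\frac{\partial L}{\partial h_i}(\lambda^*)$ because this is the only derivative that needs to be satisfied in both cases. 
This process is the same as for equality constraints. 

So, the maximum can be found by solving for all $i$ 

$$\frac{\partial L(h_i)}{\partial h_i} = 0$$

Recall our previous analysis of the Lagrange function: 

$$\begin{aligned}
L(h_i) &= I(h;O) + d \sum_k \lambda_k \left(\sum_i h_i f_{i,k} - F_k\right) \\
&= d \sum_{o_s \in O_i} h_i \phi_{s,i} \ln\frac{\phi_{s,i}}{o_s} + d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \ln\frac{\phi_{s,r}}{o_s} + d \sum_k \lambda_k \left(\sum_i h_i f_{i,k} - F_k\right)
\end{aligned}$$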
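We compute the derivative of each item in the Lagrange function. For the first item: 

$$\frac{\partial}{\partial h_i}\left(d \sum_{o_s \in O_i} h_i \phi_{s,i} \ln\frac{\phi_{s,i}}{o_s}\right) = d \sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - d \sum_{o_s \in O_i} h_i \phi_{s,i} \frac{\phi_{s,i}}{o_s}$$

For the second item: 

$$\frac{\partial}{\partial h_i}\left(d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \ln \phi_{s,r} - d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \ln o_s\right) = 0 - d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \frac{\phi_{s,i}}{o_s}$$

Because for $h_r \in P_i$ the term $h_r \phi_{s,r} \ln \phi_{s,r}$ does not include any $h_i$, its derivative with respect to $h_i$ is 0. 
We combine the first two items and then simplify the expression as follows: 

$$\begin{aligned}
& d \sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - d \sum_{o_s \in O_i} h_i \phi_{s,i} \frac{\phi_{s,i}}{o_s} - d \sum_{o_s \in O_i,\, h_r \in P_i} h_r \phi_{s,r} \frac{\phi_{s,i}}{o_s} \\
={}& d\left(\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - \sum_{o_s \in O_i} \frac{\phi_{s,i}}{o_s}\Big(h_i \phi_{s,i} + \sum_{h_r \in P_i} h_r \phi_{s,r}\Big)\right) \\
={}& d\left(\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - \sum_{o_s \in O_i} \phi_{s,i}\right) \\
={}& d\left(\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1\right)
\end{aligned}$$

where we used $o_s = h_i \phi_{s,i} + \sum_{h_r \in P_i} h_r \phi_{s,r}$ and $\sum_{o_s \in O_i} \phi_{s,i} = 1$. 

The third item is a linear function of $h_i$, so: 

$$\frac{\partial}{\partial h_i}\left(d \sum_k \lambda_k \left(\sum_i h_i f_{i,k} - F_k\right)\right) = d \sum_k \lambda_k f_{i,k}$$

From these results we conclude that $\max I(h;O)$ can be achieved by solving for $h_i$ in the following 
equation system: 

$$d\left(\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1\right) + d \sum_k \lambda_k f_{i,k} = 0$$

that is, 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \sum_k \lambda_k f_{i,k} = 0$$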
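As mentioned before, this equation needs to satisfy the following condition: 

$$\lambda_k \ge 0 \quad \text{or} \quad \sum_i h_i f_{i,k} \ge F_k \;\wedge\; \lambda_k = 0$$

When $\lambda_k = 0$ for all $k > 0$, the equation can be simplified to 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \lambda_0 = 0$$

where $\lambda_0$ is the multiplier for the constraint $\sum_i h_i = 1$. 

So, we arrive at the conclusion that to maximize $L$, the following equations need to be solved with 
the constraints: 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \sum_k \lambda_k f_{i,k} = 0 \;\wedge\; \lambda_k \ge 0$$

or 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \lambda_0 = 0 \;\wedge\; \sum_i h_i f_{i,k} \ge F_k$$

This completes the proof. 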
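The central step of the proof, namely that the derivative of the leakage with respect to $h_i$ is $d\left(\sum_{o_s} \phi_{s,i} \ln(\phi_{s,i}/o_s) - 1\right)$, can be sanity-checked with finite differences. The following is a minimal sketch under assumptions made here: the factor $d$ is dropped (everything is in nats), $h$ is treated as a vector of free variables as in the Lagrangian, the matrix layout is `phi[i][s]` $= P(o_s \mid h_i)$, and the function names and sample channel are hypothetical.

```python
import math


def output_probs(h, phi):
    """o_s = sum_i h_i * phi[i][s]; h is treated as free (not renormalized)."""
    return [sum(h[i] * phi[i][s] for i in range(len(h))) for s in range(len(phi[0]))]


def leakage_nats(h, phi):
    """sum_{i,s} h_i phi_{s,i} ln(phi_{s,i} / o_s), i.e. I(h;O) without the factor d."""
    o = output_probs(h, phi)
    return sum(
        h[i] * p * math.log(p / o[s])
        for i, row in enumerate(phi)
        for s, p in enumerate(row)
        if p > 0
    )


def analytic_gradient(h, phi):
    """The expression derived in the proof: sum_s phi_{s,i} ln(phi_{s,i}/o_s) - 1."""
    o = output_probs(h, phi)
    return [
        sum(p * math.log(p / o[s]) for s, p in enumerate(row) if p > 0) - 1.0
        for row in phi
    ]


def numeric_gradient(h, phi, eps=1e-6):
    """Central finite differences of leakage_nats in each coordinate h_i."""
    grad = []
    for i in range(len(h)):
        up, down = list(h), list(h)
        up[i] += eps
        down[i] -= eps
        grad.append((leakage_nats(up, phi) - leakage_nats(down, phi)) / (2 * eps))
    return grad


if __name__ == "__main__":
    h = [0.4, 0.6]
    phi = [[0.7, 0.3], [0.2, 0.8]]
    print(analytic_gradient(h, phi))
    print(numeric_gradient(h, phi))
```

The $-1$ in the analytic gradient relies on every row of `phi` summing to 1, exactly as in the simplification step of the proof.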

If the system is completely deterministic, that is, one input can only generate one "observation", then 
the $o_j$'s are defined in terms of the high inputs $h_{j_1}, \ldots, h_{j_n}$ that generate the "observation", i.e. 

$$o_j = h_{j_1} + \cdots + h_{j_n}$$

Notice then that $(h_i \mid o_j) = \frac{(h_i, o_j)}{o_j}$ and that $(h_i, o_j) = h_i$ if $h_i$ generates the observation $o_j$, and is 0 otherwise. 

There is only one possible observation in the model associated with a high input $h = v_i$; it is 
denoted as $O(h_i)$, and we define $o_i = \mu(O(h_i))$. 

Hence, we can simplify Theorem 3.1 to the following proposition by replacing $\phi_{s,i}$ with 1: 

Proposition 3.2 In deterministic channels, the probabilities $h_i$ maximizing $I(h;O)$ subject to the family 
of constraints $(C_k)_{k \in K} \equiv g_k(h_i) \ge F_k$ are given by solving in $h_i$ the following system: 

$$-\ln(o_i) - 1 + \sum_k \lambda_k f_{i,k} = 0 \;\wedge\; \lambda_k \ge 0 \;\wedge\; g_k(h_i) \ge F_k$$

or 

$$-\ln(o_i) - 1 + \lambda_0 = 0 \;\wedge\; g_k(h_i) \ge F_k$$
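The unconstrained (second) case of Proposition 3.2 can be illustrated on a toy deterministic program. The sketch below is hypothetical throughout: the program `l = h % 3` over eight secrets is an example chosen here, and the input distribution is constructed by hand so that every output is equally likely, which is what $-\ln(o_i) - 1 + \lambda_0 = 0$ demands when no other constraint is present.

```python
import math
from collections import defaultdict


def program(h):
    """A hypothetical deterministic program: the observation is h modulo 3."""
    return h % 3


def equal_output_distribution(secrets):
    """Spread probability so that each reachable output gets the same total mass."""
    classes = defaultdict(list)
    for h in secrets:
        classes[program(h)].append(h)
    n_out = len(classes)
    return {
        h: 1.0 / (n_out * len(members))
        for members in classes.values()
        for h in members
    }


def multipliers_and_capacity(dist):
    """Return lambda_0 per secret (from -ln(o_i) - 1 + lambda_0 = 0) and d(1 - lambda_0)."""
    out_prob = defaultdict(float)
    for h, p in dist.items():
        out_prob[program(h)] += p
    lam0 = [1.0 + math.log(out_prob[program(h)]) for h in dist]
    capacity_bits = (1.0 - lam0[0]) / math.log(2)
    return lam0, capacity_bits


if __name__ == "__main__":
    dist = equal_output_distribution(range(8))
    lam0, capacity_bits = multipliers_and_capacity(dist)
    print(capacity_bits, math.log2(3))
```

All secrets yield the same $\lambda_0$, and $d(1 - \lambda_0)$ agrees with the log of the number of outputs.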

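Proposition 3.3 In both probabilistic and deterministic channels, the channel capacity without given 
knowledge is given by 

$$d \sum_i h_i \left(1 - \sum_k \lambda_k f_{i,k}\right)$$

In the case of $\lambda_k = 0$ for all $k > 0$ this simplifies to 

$$d(1 - \lambda_0)$$

where $d = \frac{1}{\ln 2}$. 

Proof. 

$$\begin{aligned}
H(h) - H(h \mid O) &= H(h) + \sum_{j,i} h_i \phi_{j,i} \log\left(\frac{h_i \phi_{j,i}}{o_j}\right) \\
&= H(h) - \sum_{j,i} h_i \phi_{j,i} \log\left(\frac{o_j}{\phi_{j,i}}\right) - H(h) \\
&= \sum_{j,i} h_i \phi_{j,i} \log\left(\frac{\phi_{j,i}}{o_j}\right) \\
&= \sum_i h_i \sum_j \phi_{j,i} \log\left(\frac{\phi_{j,i}}{o_j}\right) \\
&= d \sum_i h_i \left(1 - \sum_k \lambda_k f_{i,k}\right)
\end{aligned}$$

where in deterministic channels $\phi_{j,i} = 1$. In the case $\lambda_k = 0$ ($k \ge 1$) the expression becomes: 

$$d \sum_i h_i (1 - \lambda_0) = d(1 - \lambda_0)$$

which indicates one possible result. 
This completes the proof. 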



3.3 Comparison with The Results Using Lagrange Multipliers 

From Theorem 3.1 we notice that, in the solution of a constrained optimization problem, the inequality 
constraints either constrain the solution (i.e. $\lambda_i \ne 0 \wedge g_k(h) = F_k$), or they do not (i.e. $\lambda_i = 0$). If they 
do, we can use Lagrange multipliers to find the optimal solution by treating the inequality constraints as 
equality ones; otherwise, the constraints do not affect the solution. So, does this mean that the channel 
capacity theorem deduced by KKT is no improvement upon [21, 7]? The answer is no, because when 
there is a set of inequality constraints, it is difficult to determine which of them are constraining the 
problem. A method of classification is then necessary to check whether each inequality constraint constrains the solution 
or not. This is exactly what KKT conditions do: whether the constraints constrain the maxima or 
not, KKT deals with them elegantly. 
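The two-way classification just described can be sketched on the smallest possible case. The example below is not from the original text: it assumes a hypothetical identity channel on two secrets (so $o_i = h_i$ and the leakage is the entropy of $h$) with the single constraint $h_1 \ge c\,h_2$, i.e. $f_{1,1} = 1$, $f_{2,1} = -c$, $F_1 = 0$. For this toy case both KKT branches can be solved in closed form.

```python
import math


def two_input_capacity(c):
    """Identity channel on two secrets with constraint h1 >= c * h2 (c > 0).

    Returns ((h1, h2), (lambda_0, lambda_1), capacity in bits).
    """
    # Branch 1: constraint inactive (lambda_1 = 0); the unconstrained optimum is uniform.
    h1, h2, lam1 = 0.5, 0.5, 0.0
    if h1 < c * h2:
        # Branch 2: constraint active, h1 = c * h2 together with h1 + h2 = 1.
        h1, h2 = c / (1 + c), 1 / (1 + c)
        # Subtracting the two stationarity equations
        #   -ln(h1) - 1 + lambda_0 + lambda_1     = 0
        #   -ln(h2) - 1 + lambda_0 - c * lambda_1 = 0
        # gives lambda_1 = ln(h1 / h2) / (1 + c).
        lam1 = math.log(h1 / h2) / (1 + c)
        if lam1 < 0:
            raise ValueError("KKT sign condition lambda_1 >= 0 violated")
    lam0 = 1 + math.log(h1) - lam1
    d = 1 / math.log(2)
    # Proposition 3.3 with f_{1,1} = 1 and f_{2,1} = -c.
    capacity = d * (h1 * (1 - lam0 - lam1) + h2 * (1 - lam0 + c * lam1))
    return (h1, h2), (lam0, lam1), capacity


if __name__ == "__main__":
    print(two_input_capacity(0.5))   # constraint does not bind
    print(two_input_capacity(100))   # constraint binds
```

For realistic channels neither branch has a closed form, so each branch would be handed to a numerical solver instead; the classification logic stays the same.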



4 Applications of the Results 



Theorem 3.1 and Proposition 3.3 can be applied to both programs and protocols to compute channel capacity 
with inequality constraints. In this section, two examples (a program and a protocol) are studied to 
show how Theorem 3.1 and Proposition 3.3 are applied, and the results are explained. Further, a short 
discussion is given on implementing this approach for automatic computation. 



4.1 Example: A Multi-threaded Program 

Let us start with a simple probabilistic nested multi-threaded program: 

l = h % 2 | (l = 0 | l = 1) 

h        | o_0                      | o_1 
h_odd    | p(1-q) + (1-p)(1-q)p     | 1 - p(1-q) - (1-p)(1-q)p 
h_even   | 1 - pq - (1-p)pq         | pq + (1-p)pq 

Table 1: A multi-threaded program: observations and probabilities 



Suppose that the outer thread has probability $p$ to run "l = h % 2" first and the inner thread has probability 
$q$ to run "l = 0" before "l = 1". From the program we know that there are two possible observations: 0 ($o_0$) 
and 1 ($o_1$). We list all the possible values of $h$, observations and the conditional probabilities in Table 1. 
Assume $h$ is strictly less likely to be odd than even, i.e. the constraint on the input is: 

$$h_{odd} < h_{even}$$



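Using Theorem 3.1 we get the equations 

$$-a \ln\left(\frac{a\,h_{odd} + b\,h_{even}}{a}\right) - (1-a) \ln\left(\frac{(1-a)\,h_{odd} + (1-b)\,h_{even}}{1-a}\right) - 1 + \lambda_0 + \lambda_1 = 0$$

$$-b \ln\left(\frac{a\,h_{odd} + b\,h_{even}}{b}\right) - (1-b) \ln\left(\frac{(1-a)\,h_{odd} + (1-b)\,h_{even}}{1-b}\right) - 1 + \lambda_0 - \lambda_1 = 0$$

where $a = p(1-q) + (1-p)(1-q)p$ and $b = 1 - pq - (1-p)pq$. 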

Firstly we consider the extreme case $p = 1$ and we solve the equation system to get 

$$\lambda_0 = 1, \qquad \lambda_1 = 0$$

Using Proposition 3.3 we know that the channel capacity is 0. This is because $p = 1$ means that 
"l = h % 2" always runs first, so the program is secure: the result cannot reveal any information about 
the secret. Now we suppose $p = q = \frac{1}{3}$, which gives $a = 0.3704$ and $b = 0.8148$. 
Because the inequality is strict, we cannot have $h_{odd} = h_{even}$. Thus we consider whether the other possibility in 
Theorem 3.1, $\lambda_1 = 0$, can be satisfied, and we find: 

$$h_{odd} = 0.4836, \qquad h_{even} = 0.5164, \qquad \lambda_0 = 0.8931, \qquad \lambda_1 = 0$$

This solution does satisfy $h_{odd} < h_{even}$, and the distribution is the one we are after.² Using Proposition 
3.3 we get the channel capacity: 

$$d\left(h_{odd}(1 - \lambda_0 - \lambda_1) + h_{even}(1 - \lambda_0 + \lambda_1)\right) = d \cdot 0.1069 \approx 0.154 \text{ bits}$$

The channel capacity is small because, among the three statements, the program leaks only when "l = h % 2" is run 
last, and the leakage is then 1 bit. The other two statements do not contribute to the leakage 
but further confuse the observation by producing the same outputs 0 and 1, making the leakage even smaller. 
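The reported solution for $p = q = \frac{1}{3}$ can be checked by plugging it back into the stationarity equations. The sketch below only evaluates the formulas; it does not solve the system. The function names and the matrix layout (`phi[i][s]`, rows $h_{odd}$, $h_{even}$, columns $o_0$, $o_1$) are assumptions made here.

```python
import math


def thread_channel(p, q):
    """Channel matrix of Table 1: rows (h_odd, h_even), columns (o_0, o_1)."""
    a = p * (1 - q) + (1 - p) * (1 - q) * p
    b = 1 - p * q - (1 - p) * p * q
    return [[a, 1 - a], [b, 1 - b]]


def stationarity_terms(h, phi):
    """For each input i: sum_s phi_{s,i} * ln(phi_{s,i} / o_s), in nats."""
    o = [sum(h[i] * phi[i][s] for i in range(len(h))) for s in range(len(phi[0]))]
    return [sum(p * math.log(p / o[s]) for s, p in enumerate(row) if p > 0) for row in phi]


def check_reported_solution():
    """Evaluate the equations at the distribution reported in the text."""
    phi = thread_channel(1 / 3, 1 / 3)
    h = [0.4836, 0.5164]            # (h_odd, h_even) as reported, 4 decimals
    terms = stationarity_terms(h, phi)
    lam0 = 1 - terms[0]             # from term - 1 + lambda_0 = 0 with lambda_1 = 0
    capacity_bits = (1 - lam0) / math.log(2)
    return terms, lam0, capacity_bits


if __name__ == "__main__":
    terms, lam0, capacity_bits = check_reported_solution()
    print(terms, lam0, capacity_bits)
```

Both terms coincide to about four decimals, so the same $\lambda_0$ satisfies both equations with $\lambda_1 = 0$; the value $1 - \lambda_0 \approx 0.1069$ is in nats and corresponds to about 0.154 bits after multiplying by $d$.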



4.2 Example: Onion Routing 

Onion Routing [24] is designed to protect data and sender anonymity in communication over a public 
network such as the Internet. The general idea is that, when a client (sender) wants to send a message to a receiver 
$r$, it will choose a path $p_1, \ldots, p_n$ of routers and encrypt the message $m$ as $P_1(\ldots(P_n(R(m), r))\ldots, 2)$ 

² Notice that values of $h_{odd} > 0.4836$ result in a lower leakage. 
Sender    | Path                    | O (in, out) | P(O_(in,out) | h_i) 
1 (h_1)   | 1 -> 2 -> R             | (N, N)      | 1/3 
1 (h_1)   | 1 -> 2 -> 3 -> R        | (2, R)      | 1/3 
1 (h_1)   | 1 -> 2 -> 4 -> 3 -> R   | (4, R)      | 1/3 
2 (h_2)   | 2 -> 4 -> 3 -> R        | (4, R)      | 1/2 
2 (h_2)   | 2 -> 3 -> R             | (2, R)      | 1/2 
3 (h_3)   | 3 -> 2 -> R             | (N, R)      | 1 
4 (h_4)   | 4 -> 3 -> R             | (4, R)      | 1/2 
4 (h_4)   | 4 -> 3 -> 2 -> R        | (4, 2)      | 1/2 

Table 2: Onion Routing: observations and probabilities 



where $P_i$ (resp. $R$) is the public key of the router $i$ (resp. receiver $r$). When the router $p_i$ receives 
$P_i(\ldots(P_n(R(m), r))\ldots, i+1)$ it will use its private key to decrypt the message and will so obtain 
the inner message $P_{i+1}(\ldots(P_n(R(m), r))\ldots, i+2)$, which it then sends to $p_{i+1}$. The usual 
assumptions are: 

1. A circuit can be of any number of nodes as long as no node appears twice. 

2. The client never sends the message to the server directly. 

3. Observations of a node include the previous node and the next one. 

4. All paths are equally likely. 

If the attacker can observe one router $p_i$ then there may be a loss of anonymity: the attacker is able 
to observe which node delivered the packet to it and which node the packet is then delivered to. 

Here we will show how the loss of sender anonymity can be quantitatively analyzed using the 
definition of channel capacity. We use the same simple Onion Routing network from [7], as shown in Figure 1, 
but different and meaningful constraints will be demonstrated. The node "R" is the receiver. There are 4 
nodes 1, 2, 3, 4, any of which can initiate the communication; node 3 is an adversary in the 
network. We list all the possible paths, observations on the adversary node and the conditional probabilities 
for the observations in Table 2. 




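Figure 1: Example of an Onion Routing Network 

From Table 2 we get the $o_j$ using $o_j = \sum_i h_i \phi_{j,i}$: 

$$o_{(N,N)} = \frac{1}{3}h_1, \qquad o_{(2,R)} = \frac{1}{3}h_1 + \frac{1}{2}h_2,$$

$$o_{(4,R)} = \frac{1}{3}h_1 + \frac{1}{2}h_2 + \frac{1}{2}h_4, \qquad o_{(N,R)} = h_3, \qquad o_{(4,2)} = \frac{1}{2}h_4$$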
We now consider the case when an active user sends out messages more frequently than non-active 
users. Here we assume $h_1$ has greater probability than $h_2$. Then we have an additional constraint 
$h_1 \ge h_2$ together with the constraint $C_0: h_1 + h_2 + h_3 + h_4 = 1$. 

We use Theorem 3.1 to get the following equations: 

$$-\frac{1}{3}\left(\ln(3o_{(N,N)}) + \ln(3o_{(2,R)}) + \ln(3o_{(4,R)})\right) - 1 + \lambda_0 + \lambda_1 = 0$$

$$-\frac{1}{2}\ln(2o_{(4,R)}) - \frac{1}{2}\ln(2o_{(2,R)}) - 1 + \lambda_0 - \lambda_1 = 0$$

$$-\ln(o_{(N,R)}) - 1 + \lambda_0 = 0$$

$$-\frac{1}{2}\ln(2o_{(4,2)}) - \frac{1}{2}\ln(2o_{(4,R)}) - 1 + \lambda_0 = 0$$

We first consider whether the equality $h_1 = h_2$ can hold; solving the above equations we find 

$$h_1 = 0.1674, \quad h_2 = 0.1674, \quad h_3 = 0.3903, \quad h_4 = 0.2750, \quad \lambda_0 = 0.0591, \quad \lambda_1 = -0.0072$$

But this solution does not satisfy $\lambda_1 \ge 0$. 

Then we only consider the solution for the other possibility, $\lambda_1 = 0$, and we get the results: 

$$h_1 = 0.1735, \quad h_2 = 0.1603, \quad h_3 = 0.3902, \quad h_4 = 0.2760, \quad \lambda_0 = 0.0590, \quad \lambda_1 = 0$$

This solution does satisfy $h_1 \ge h_2$. Using Proposition 3.3 we get the channel capacity: 

$$d\left(h_1(1 - \lambda_0 - \lambda_1) + h_2(1 - \lambda_0 + \lambda_1) + (h_3 + h_4)(1 - \lambda_0)\right) = 1.3576 \text{ bits}$$

When we have a strict inequality constraint, as we mentioned before, we may only be able to find an approximate 
solution when the exact solution cannot be achieved. The following example shows such a case. 
Here we use a constraint similar to the one above, assuming that the first node is more than 100 times as likely to send the 
message as the second: 

$$h_1 > 100h_2$$

Using Theorem 3.1 we know that the second equation above becomes 

$$-\frac{1}{2}\ln(2o_{(4,R)}) - \frac{1}{2}\ln(2o_{(2,R)}) - 1 + \lambda_0 - 100\lambda_1 = 0$$

while the other three equations stay the same because the change of constraint does not affect them. 
From the above result we know that the solution for the case $\lambda_1 = 0$ does not satisfy $h_1 > 100h_2$. We 
can use the equality constraint instead to find an approximate solution. Assuming $h_1 = 100h_2$, we have 

$$h_1 = 0.2868, \quad h_2 = 0.0029, \quad h_3 = 0.3979, \quad h_4 = 0.3125, \quad \lambda_0 = 0.0783, \quad \lambda_1 = 0.0024$$

Using Proposition 3.3 we get the channel capacity: 

$$d\left(h_1(1 - \lambda_0 - \lambda_1) + h_2(1 - \lambda_0 + 100\lambda_1) + (h_3 + h_4)(1 - \lambda_0)\right) = 1.3297 \text{ bits}$$

Note that this is an approximate solution achieved when $100h_2 + \epsilon = h_1$, $\epsilon \to 0$. 

In the first case, when the constraint is $h_1 \ge h_2$, the channel capacity is 1.3576 bits. The entropy of 
the original secret is 1.9042 bits, which means the protocol leaks up to 71.3% of the confidential information. 
In the second case, where the constraint is $h_1 > 100h_2$, the channel capacity is 1.3297 bits. Since the 
original confidential information is 1.5946 bits, the rate is increased to 83.4%, which means the system is 
much more insecure. The reason is that $h_1$ and $h_2$ share the same observations, (4,R) and (2,R). Once the 
attacker observes these pairs, he/she can more confidently guess the initial sender to be $h_1$ rather than $h_2$, 
given knowledge of the constraint $h_1 > 100h_2$. Thus, the constraint does affect the security of the protocol 
by reducing the confusion between $h_1$ and $h_2$. 
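The leakage figures and ratios quoted above can be recomputed from the reported distributions. The sketch below only evaluates the mutual information and entropy formulas; it does not solve the KKT system. It assumes the channel matrix implied by Table 2 (rows $h_1, \ldots, h_4$; columns in the order (N,N), (2,R), (4,R), (N,R), (4,2)), and the names are hypothetical. The reported probabilities are rounded to four decimals, so the results agree only to about three decimals.

```python
import math

# Channel matrix implied by Table 2: ONION_PHI[i][s] = P(o_s | h_{i+1}).
ONION_PHI = [
    [1 / 3, 1 / 3, 1 / 3, 0.0, 0.0],   # sender 1
    [0.0, 1 / 2, 1 / 2, 0.0, 0.0],     # sender 2
    [0.0, 0.0, 0.0, 1.0, 0.0],         # sender 3
    [0.0, 0.0, 1 / 2, 0.0, 1 / 2],     # sender 4
]


def leakage_bits(h, phi):
    """I(h;O) = sum_{i,s} h_i phi_{s,i} log2(phi_{s,i} / o_s)."""
    o = [sum(h[i] * phi[i][s] for i in range(len(h))) for s in range(len(phi[0]))]
    return sum(
        h[i] * p * math.log2(p / o[s])
        for i, row in enumerate(phi)
        for s, p in enumerate(row)
        if p > 0
    )


def entropy_bits(h):
    """Shannon entropy of the sender distribution, in bits."""
    return -sum(p * math.log2(p) for p in h if p > 0)


def leak_ratio(h, phi=ONION_PHI):
    """Fraction of the secret's entropy that is leaked."""
    return leakage_bits(h, phi) / entropy_bits(h)


# Distributions as reported in the text for the two constraints.
CASE_GE = [0.1735, 0.1603, 0.3902, 0.2760]     # h_1 >= h_2
CASE_100 = [0.2868, 0.0029, 0.3979, 0.3125]    # h_1 = 100 * h_2 (approximation)

if __name__ == "__main__":
    for h in (CASE_GE, CASE_100):
        print(leakage_bits(h, ONION_PHI), entropy_bits(h), leak_ratio(h))
```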

In both cases, the channel capacity is around 1.3 bits, which seems to imply that the protocol is 
insecure. Two observations are in order. First notice that by repeating observations on these networks 
the loss of anonymity is not increased. Secondly in the real deployment of onion routing on the Internet 
(such as Tor), there are hundreds of nodes, with complex connectivity frequently updated; because of the 
number of possible connections in such large scale networks the channel capacity is very low. 



We have only one constraint in the above cases, but from the formula in Theorem 3.1 

$$\sum_{o_s \in O_i} \phi_{s,i} \ln\left(\frac{\phi_{s,i}}{o_s}\right) - 1 + \sum_k \lambda_k f_{i,k} = 0$$

multiple constraints will only affect the last item $\sum_k \lambda_k f_{i,k}$ in the equation system. The complexity 
increases linearly with the number of multipliers $\lambda_k$. 

4.3 A Note on Automatic Computation 

Automatic analysis of programs and protocols can be achieved in two steps. The first step is to analyze 
the program or protocol to deduce the statistical relationship between $O$ and $h$. Recent works to automate 
this part include [14, 16], which track the analyzed program iteratively to derive a precise answer. 
Alternatively, [5, 9] used simulations to derive an estimation. For the particular example of anonymity routing 
protocols, it is also possible to work out the statistical relationship based on the graph topology including 
vertices, edges and adversaries. Based on the relationship, the equation system can be produced using 
Theorem 3.1. The second step is the automatic solution of the equation system. Automated solution of 
such an equation system has been implemented in standard mathematical packages, e.g. MATLAB. 
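As an illustration of the second step without a mathematical package, the unconstrained branch ($\lambda_k = 0$ for $k > 0$) can be computed with the Blahut-Arimoto iteration, the iterative algorithm of Blahut [2] mentioned in Section 1.1, after which the inequality is checked on the result. This is a swapped-in technique, not the solver used for the numbers in this paper; the function names, the iteration count and the constraint encoding are assumptions made here. If the check fails, the active branch still has to be solved separately with a nonlinear solver.

```python
import math


def blahut_arimoto(phi, iterations=5000):
    """Approximate the capacity-achieving input distribution of channel phi.

    phi[i][s] = P(o_s | h_i). Each step reweights h_i by exp(D_i), where
    D_i = sum_s phi_{s,i} ln(phi_{s,i} / o_s) is the term of Theorem 3.1.
    """
    n = len(phi)
    h = [1.0 / n] * n
    for _ in range(iterations):
        o = [sum(h[i] * phi[i][s] for i in range(n)) for s in range(len(phi[0]))]
        d = [sum(p * math.log(p / o[s]) for s, p in enumerate(row) if p > 0) for row in phi]
        weights = [h[i] * math.exp(d[i]) for i in range(n)]
        total = sum(weights)
        h = [w / total for w in weights]
    return h


def leakage_bits(h, phi):
    """I(h;O) in bits for input distribution h and channel phi."""
    o = [sum(h[i] * phi[i][s] for i in range(len(h))) for s in range(len(phi[0]))]
    return sum(
        h[i] * p * math.log2(p / o[s])
        for i, row in enumerate(phi)
        for s, p in enumerate(row)
        if p > 0 and h[i] > 0
    )


def unconstrained_branch(phi, coeffs, bound):
    """Solve the lambda_k = 0 branch and report whether sum_i h_i f_i >= bound holds."""
    h = blahut_arimoto(phi)
    feasible = sum(f * p for f, p in zip(coeffs, h)) >= bound
    return h, leakage_bits(h, phi), feasible


if __name__ == "__main__":
    # Multi-threaded program of Section 4.1 with p = q = 1/3; constraint h_even - h_odd >= 0.
    phi = [[10 / 27, 17 / 27], [22 / 27, 5 / 27]]
    print(unconstrained_branch(phi, coeffs=(-1.0, 1.0), bound=0.0))
```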



5 Conclusion and Future Work 

We applied Karush-Kuhn-Tucker conditions to compute the channel capacity of probabilistic information 
leakage channels with inequality constraints. We derived a series of theorems and propositions and 
showed how these results can be applied to programs and protocols. Our calculations provide general and 
accurate solutions to measure the maximum information leakage in a system. 

Our future work will investigate other continuous definitions of information leakage using 
Karush-Kuhn-Tucker conditions. Notably, we propose to compute the maximum ratio between the channel capacity 
of a leakage channel and the entropy of the original secret, which in some cases could present a better definition 
of the worst case. Additionally, a comparison of the information theoretical and probabilistic analysis of 
probabilistic channels [12, 11] would also yield interesting results. 


